 reserve price


Dynamic Incentive-Aware Learning: Robust Pricing in Contextual Auctions

Neural Information Processing Systems

Motivated by pricing in ad exchange markets, we consider the problem of robust learning of reserve prices against strategic buyers in repeated contextual second-price auctions. Buyers' valuations for an item depend on the context that describes the item. However, the seller is not aware of the relationship between the context and buyers' valuations, i.e., buyers' preferences. The seller's goal is to design a learning policy to set reserve prices via observing the past sales data, and her objective is to minimize her regret for revenue, where the regret is computed against a clairvoyant policy that knows buyers' heterogeneous preferences. Given the seller's goal, utility-maximizing buyers have the incentive to bid untruthfully in order to manipulate the seller's learning policy.
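As a reference point for the setting above, the following is a minimal sketch of the seller's revenue in a single second-price auction with a reserve price. It uses the standard textbook rule (the highest bidder wins if the bid clears the reserve and pays the larger of the reserve and the second-highest bid). The function name `second_price_revenue` is hypothetical, and the sketch does not reproduce the paper's contextual learning policy.

```python
def second_price_revenue(bids, reserve):
    """Seller revenue in one second-price auction with a reserve price.

    Standard rule (an illustration, not the paper's learning policy):
    the item sells only if the highest bid is at least the reserve, and
    the winner pays max(second-highest bid, reserve).
    """
    # Keep only bids that clear the reserve, highest first.
    eligible = sorted((b for b in bids if b >= reserve), reverse=True)
    if not eligible:
        return 0.0  # no bid clears the reserve: no sale
    if len(eligible) == 1:
        return float(reserve)  # second-highest bid is below the reserve
    return float(eligible[1])  # second-highest bid is at least the reserve
```

With bids of 5 and 3, a reserve of 4 raises revenue from 3 to 4, while a reserve of 6 loses the sale. This is the trade-off a reserve-price policy has to balance.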


A Bandit Learning Algorithm and Applications to Auction Design

Neural Information Processing Systems

We consider online bandit learning in which at every time step, an algorithm has to make a decision and then observe only its reward. The goal is to design efficient (polynomial-time) algorithms that achieve a total reward approximately close to that of the best fixed decision in hindsight. In this paper, we introduce a new notion of $(\lambda,\mu)$-concave functions and present a bandit learning algorithm that achieves a performance guarantee which is characterized as a function of the concavity parameters $\lambda$ and $\mu$. The algorithm is based on the mirror descent algorithm in which the update directions follow the gradient of the multilinear extensions of the reward functions. The regret bound induced by our algorithm is $\widetilde{O}(\sqrt{T})$ which is nearly optimal.




Revenue Optimization with Approximate Bid Predictions

Andres Munoz Medina, Sergei Vassilvitskii (Google Research, 76 9th Ave, New York, NY 10011)

Neural Information Processing Systems

In the context of advertising auctions, finding good reserve prices is a notoriously challenging learning problem. This is due to the heterogeneity of ad opportunity types, and the non-convexity of the objective function. In this work, we show how to reduce reserve price optimization to the standard setting of prediction under squared loss, a well understood problem in the learning community. We further bound the gap between the expected bid and revenue in terms of the average loss of the predictor. This is the first result that formally relates the revenue gained to the quality of a standard machine learned model.
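The non-convexity mentioned above can be seen on a toy example. The sketch below assumes each auction is summarized by its highest and second-highest bid and uses made-up bid values. It shows that the average revenue as a function of the reserve price is not concave, so the corresponding loss (negative revenue) is not convex. The names `revenue` and `average_revenue` are hypothetical, and this is not the paper's reduction to squared-loss prediction.

```python
def revenue(highest, second, reserve):
    """Revenue of one second-price auction given its top two bids."""
    if reserve > highest:
        return 0.0  # reserve too high: no sale
    return float(max(second, reserve))


def average_revenue(auctions, reserve):
    """Mean revenue over a list of (highest, second) bid pairs."""
    return sum(revenue(h, s, reserve) for h, s in auctions) / len(auctions)


# Two made-up auctions, each with one serious bidder.
auctions = [(1.0, 0.0), (3.0, 0.0)]

low = average_revenue(auctions, 1.0)   # both auctions sell at price 1
mid = average_revenue(auctions, 2.0)   # only the second auction sells
high = average_revenue(auctions, 3.0)  # only the second auction sells

# A concave function would satisfy mid >= (low + high) / 2.
is_concave_here = mid >= 0.5 * (low + high)
print(low, mid, high, is_concave_here)
```

The revenue at the midpoint reserve falls below the average of the two endpoint revenues, so concavity fails and gradient-style methods cannot be applied to this objective directly.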



Learning Optimal Reserve Price against Non-myopic Bidders

Jinyan Liu, Zhiyi Huang, Xiangning Wang

Neural Information Processing Systems

We introduce algorithms that obtain a small regret against non-myopic bidders either when the market is large, i.e., no single bidder appears in more than a small constant fraction of the rounds, or when the bidders are impatient, i.e., they discount future utility by some factor mildly bounded away from one.

